
Bilingual term alignment from comparable corpora in English discharge summary and Chinese discharge summary



Abstract

Background: Electronic medical record (EMR) systems are widely used throughout the world to improve the quality of healthcare and the efficiency of hospital services. A bilingual Chinese-English medical lexicon is needed to support multilingual, multinational treatment. We extract a bilingual lexicon from English and Chinese discharge summaries, starting from a small seed lexicon. Lexical terms fall into two categories: single-word terms (SWTs) and multi-word terms (MWTs). For SWTs, we use a context-based label propagation (LP) method to extract candidate translation pairs. For MWTs, which are pervasive in the medical domain, we propose a term alignment method that first obtains translation candidates for each component word of a Chinese MWT and then generates combinations of those candidates, from which the system selects a set of plausible translation candidates.

Results: We compare our LP method with a baseline method based on simple context similarity. The LP-based method outperforms the baseline, with accuracies of 4.44% Acc1, 24.44% Acc10, and 62.22% Acc100, where AccN denotes top-N accuracy. For MWTs, the accuracy of the LP method drops to 5.41% Acc10 and 8.11% Acc20. Our experiments show that the method based on term alignment improves performance on MWTs to 16.22% Acc10 and 27.03% Acc20.

Conclusions: We constructed a framework for building an English-Chinese term dictionary from discharge summaries in the two languages. Our experiments show that the LP-based method, augmented with the term alignment method, can help reduce the manual work required to compile a bilingual dictionary of clinical terms.
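
The MWT procedure and the AccN metric described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the product-of-scores ranking, the use of a set of observed English terms as the plausibility filter, and the toy scores are all assumptions made for illustration.

```python
from itertools import permutations, product


def mwt_candidates(components, word_translations, target_terms, top_n=10):
    """Rank English candidates for one Chinese multi-word term (MWT).

    components: component words of the Chinese MWT, in order.
    word_translations: maps a component word to (english_word, score)
        pairs, e.g. produced by a seed lexicon or a context-based ranker.
    target_terms: English terms observed in the English corpus; keeping
        only these is the plausibility filter assumed in this sketch.
    """
    per_word = [word_translations.get(w, []) for w in components]
    if not all(per_word):
        return []  # at least one component has no translation candidate
    scored = {}
    for combo in product(*per_word):
        # Assumed scoring: product of the component-level scores.
        score = 1.0
        for _, s in combo:
            score *= s
        # Word order may differ between languages, so try every order.
        for order in permutations(w for w, _ in combo):
            term = " ".join(order)
            if term in target_terms:
                scored[term] = max(score, scored.get(term, 0.0))
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def acc_at_n(ranked_by_source, gold, n):
    """Fraction of source terms whose gold translation is in the top n."""
    hits = sum(
        1 for src, ref in gold.items()
        if ref in ranked_by_source.get(src, [])[:n]
    )
    return hits / len(gold) if gold else 0.0


# Toy data, invented for illustration only.
word_translations = {
    "心脏": [("heart", 0.6), ("cardiac", 0.4)],
    "衰竭": [("failure", 0.7), ("exhaustion", 0.3)],
}
target_terms = {"heart failure", "cardiac failure", "renal failure"}

ranked = mwt_candidates(["心脏", "衰竭"], word_translations, target_terms)
gold = {"心脏衰竭": "cardiac failure"}
acc1 = acc_at_n({"心脏衰竭": ranked}, gold, 1)
acc2 = acc_at_n({"心脏衰竭": ranked}, gold, 2)
```

In this toy case the gold translation is ranked second, so it counts toward Acc2 but not Acc1, which is the kind of gap between AccN values at different N that the reported results show.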
